Deep-hidden Conditional Neural Fields for Continuous Phoneme Speech Recognition
Abstract
We previously proposed Hidden Conditional Neural Fields (HCNF) for automatic speech recognition and showed their effectiveness in continuous phoneme recognition experiments on the TIMIT and the Japanese ASJ+JNAS corpora. In this paper, we propose using an observation function with a deep structure in HCNF. The proposed deep observation function makes it possible to use deep neural networks, which have recently achieved remarkable success in automatic speech recognition, within HCNF. We call an HCNF with such a deep observation function a Deep-HCNF. Experimental results for continuous phoneme recognition on the TIMIT and ASJ+JNAS corpora show that Deep-HCNFs with a monophone structure outperform traditional tied-state triphone HMMs trained with the MPE criterion.
Similar resources
Hidden Conditional Neural Fields for Continuous Phoneme Speech Recognition
In this paper, we propose Hidden Conditional Neural Fields (HCNF) for continuous phoneme speech recognition, which are a combination of Hidden Conditional Random Fields (HCRF) and a MultiLayer Perceptron (MLP), and inherit their merits, namely, the discriminative property for sequences from HCRF and the ability to extract non-linear features from an MLP. HCNF can incorporate many types of featu...
Improving Phoneme Sequence Recognition using Phoneme Duration Information in DNN-HSMM
Improving phoneme recognition has attracted the attention of many researchers due to its applications in various fields of speech processing. Recent research shows that using deep neural networks (DNNs) in speech recognition systems significantly improves their performance. DNN-based phoneme recognition systems have two phases: training and testing. Mos...
Acoustic Modeling Based on Deep Conditional Random Fields
Acoustic modeling based on Hidden Markov Models (HMMs) is employed by state-of-the-art stochastic speech recognition systems. In continuous density HMMs, the state scores are computed using Gaussian mixture models. Alternatively, Deep Neural Networks (DNNs) can be used to compute the HMM state scores, which leads to a significant improvement in recognition accuracy. Conditional Random Fields...
Estimating phoneme class conditional probabilities from raw speech signal using convolutional neural networks
In hybrid hidden Markov model/artificial neural network (HMM/ANN) automatic speech recognition (ASR) systems, the phoneme class conditional probabilities are estimated by first extracting acoustic features from the speech signal based on prior knowledge, such as speech perception and/or speech production knowledge, and then modeling the acoustic features with an ANN. Recent advances in machine...
Exploiting deep neural networks for detection-based speech recognition
In recent years, deep neural networks (DNNs) – multilayer perceptrons (MLPs) with many hidden layers – have been successfully applied to several speech tasks, e.g., phoneme recognition, out-of-vocabulary word detection, confidence measures, etc. In this paper, we show that DNNs can be used to boost the classification accuracy of basic speech units, such as phonetic attributes (phonological featur...
Publication date: 2012